NSF PAR Search | NSF Public Access Repository

BBCaL: Black-box Backdoor Detection under the Causality Lens

Hu, Mengxuan; Guan, Zihan; Guo, Junfeng; Zhou, Zhongliang; Zhang, Jielu; Li, Sheng (December 2024, Transactions on Machine Learning Research)

Deep Neural Networks (DNNs) are known to be vulnerable to backdoor attacks, in which attackers inject hidden backdoors during the training stage. This poses a serious threat to the Model-as-a-Service setting, where downstream users directly use third-party models (e.g., HuggingFace Hub, ChatGPT). We therefore study the inference-stage black-box backdoor detection problem, in which defenders aim to build a firewall that filters out backdoor inputs at inference time with access only to input samples and prediction labels. Existing work on this problem either relies on strong assumptions about the types of triggers and attacks or suffers from poor efficiency. To build a more general and efficient method, we first provide a novel causality-based lens for analyzing the heterogeneous prediction behaviors of clean and backdoored samples in the inference stage, considering both sample-specific and sample-agnostic backdoor attacks. Motivated by this causal analysis and by do-calculus in causal inference, we introduce Black-box Backdoor detection under the Causality Lens (BBCaL), which distinguishes backdoor samples from clean samples by analyzing prediction consistency after progressively constructing counterfactual samples. Theoretical analysis also sheds light on the effectiveness of BBCaL. Extensive experiments on three benchmark datasets validate the effectiveness and efficiency of our method.
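The abstract does not say how BBCaL constructs its counterfactual samples or how it turns prediction consistency into a decision, so the following is only a minimal sketch of the general idea of a label-only consistency check, not the authors' algorithm. It assumes, purely for illustration, that counterfactuals are built by blending the input progressively toward a reference sample. The names `blend`, `consistency_score`, and `toy_predict`, the reference sample, and the blending levels are all hypothetical.

```python
from typing import Callable, List, Sequence

Sample = List[float]


def blend(x: Sequence[float], reference: Sequence[float], alpha: float) -> Sample:
    """Move x a fraction alpha of the way toward the reference sample.

    Assumption: this interpolation stands in for the (unspecified)
    counterfactual construction described in the abstract.
    """
    return [(1.0 - alpha) * xi + alpha * ri for xi, ri in zip(x, reference)]


def consistency_score(
    predict: Callable[[Sequence[float]], int],
    x: Sequence[float],
    reference: Sequence[float],
    alphas: Sequence[float],
) -> float:
    """Fraction of progressively modified copies that keep the original label.

    Only predicted labels are used, matching the black-box setting.
    """
    base_label = predict(x)
    agree = sum(predict(blend(x, reference, a)) == base_label for a in alphas)
    return agree / len(alphas)


def toy_predict(x: Sequence[float]) -> int:
    """Hypothetical black-box model with a planted backdoor.

    A large value in feature 0 acts as the trigger and forces label 7;
    otherwise the label depends only on the sign of feature 1.
    """
    if x[0] > 3.0:
        return 7
    return int(x[1] > 0.0)


reference = [0.0, -1.0]   # hypothetical reference sample
alphas = [0.2, 0.4, 0.6]  # progressively stronger modification

clean_score = consistency_score(toy_predict, [0.0, 1.0], reference, alphas)
triggered_score = consistency_score(toy_predict, [10.0, 1.0], reference, alphas)
print(clean_score, triggered_score)
```

In this toy setup the triggered input keeps its label across more modification levels than the clean input does. Whether real backdoor samples are more or less consistent than clean ones, and where a detection threshold should sit, depends on the attack and on the construction used in the paper.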
Full Text Available
Mind Control through Causal Inference: Predicting Clean Images from Poisoned Data

Hu, Mengxuan; Guan, Zihan; Zeng, Yi; Guo, Junfeng; Zhou, Zhongliang; Zhang, Jielu; Jia, Ruoxi; Vullikanti, Anil Kumar; Li, Sheng (January 2025, International Conference on Learning Representations)

Anti-backdoor learning, which aims to train clean models directly from poisoned datasets, is an important defense against backdoor attacks. However, existing methods usually fail to recover the original, correct labels of backdoored samples, and they generalize poorly to large pre-trained models because their training is not end-to-end, which makes them unsuitable for protecting these increasingly prevalent models. To bridge this gap, we first revisit the anti-backdoor learning problem from a causal perspective. Our theoretical causal analysis reveals that incorporating both images and the associated attack indicators preserves the model's integrity. Building on this analysis, we introduce an end-to-end method, Mind Control through Causal Inference (MCCI), to train clean models directly from poisoned datasets. The method uses both the image and the attack indicator to train the model. Under this training paradigm, the model's perception of whether an input is clean or backdoored can be controlled. In particular, by introducing fake non-attack indicators, the model perceives all inputs as clean and makes correct predictions, even for poisoned samples. Extensive experiments demonstrate that our method achieves state-of-the-art performance, efficiently recovering the original correct predictions for poisoned samples and improving accuracy on clean samples.
Free, publicly-accessible full text available January 22, 2026
Img2Loc: Revisiting Image Geolocalization using Multi-modality Foundation Models and Image-based Retrieval-Augmented Generation

https://doi.org/10.1145/3626772.3657673

Zhou, Zhongliang; Zhang, Jielu; Guan, Zihan; Hu, Mengxuan; Lao, Ni; Mu, Lan; Li, Sheng; Mai, Gengchen (July 2024, ACM)

Full Text Available
